Add robust eigh_v2 problem #163
Draft
msaroufim wants to merge 1 commit into
40ca746 to b208d15
Add a separate eigh_v2 leaderboard that keeps the existing eigh problem untouched while carrying the stricter checker and benchmark-integrity hardening from the open eigh follow-ups.

The v2 evaluator regenerates inputs for scored benchmark iterations, rejects physically impossible reported times, and keeps profile mode from the current upstream evaluator. The v2 checker requires plain tensor outputs and adds an explicit eigenvalue comparison against torch.linalg.eigvalsh(A).

The ranked set is trimmed to ten cases and repeats the central 512x512 shape across dense, mixed, rank-deficient, clustered, and row-scaled distributions so shape-only precision routing is less useful than inspecting matrix quality.

Credit: this consolidates ideas and fixes from #156, #159, #160, and #161.

Co-Authored-By: Bryce Adelstein Lelbach <brycelelbach@gmail.com>
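To make the two evaluator hardening ideas in the commit message concrete, here is a minimal sketch, not the code in `eval.py`. It regenerates the input for every scored iteration and rejects a measured time below a lower bound. The names `make_symmetric`, `min_plausible_seconds`, and `benchmark`, the bandwidth-based bound (one read of the float32 input), and the injectable `timer` are all assumptions for illustration; a GPU evaluator would presumably time with device events rather than a CPU clock.

```python
import random
import time
from typing import Callable, List

Matrix = List[List[float]]


def make_symmetric(n: int, seed: int) -> Matrix:
    """Hypothetical input generator: a random symmetric n x n matrix."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    return [[0.5 * (a[i][j] + a[j][i]) for j in range(n)] for i in range(n)]


def min_plausible_seconds(n: int, peak_bytes_per_s: float) -> float:
    """Assumed floor: the kernel must read the n*n float32 input at least once."""
    return (n * n * 4) / peak_bytes_per_s


def benchmark(
    kernel: Callable[[Matrix], object],
    n: int,
    iters: int,
    base_seed: int,
    peak_bytes_per_s: float,
    timer: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Time `kernel` on a fresh input per iteration; reject impossible times."""
    floor = min_plausible_seconds(n, peak_bytes_per_s)
    times: List[float] = []
    for i in range(iters):
        # A new seed per iteration means a kernel cannot reuse a cached result.
        a = make_symmetric(n, base_seed + i)
        start = timer()
        kernel(a)
        elapsed = timer() - start
        if elapsed < floor:
            raise ValueError(
                f"iteration {i}: {elapsed:.3e}s is below the floor {floor:.3e}s"
            )
        times.append(elapsed)
    return times
```

Passing `timer` as a parameter keeps the sketch testable with a fake clock; the bound itself is deliberately loose so that only clearly implausible measurements are rejected.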
133bded to 9bcefc4
Summary
- Adds a separate `eigh_v2` linalg problem and leaderboard entry without changing the existing `eigh` implementation or rankings.
- Keeps profile mode from `main` via Add Eigh profile mode #158.
- Checks the eigen-equation, eigenvalues against `torch.linalg.eigvalsh(A)`, and orthogonality. Reconstruction is skipped because it follows for square orthonormal `Q` when the eigen-equation holds.
- Benchmark and `leaderboard` mode use the same rechecked benchmark path instead of the previous 1000-repeat ranked loop.

Validation
- `python3 -m py_compile problems/linalg/eigh_v2/eval.py problems/linalg/eigh_v2/reference.py problems/linalg/eigh_v2/task.py problems/linalg/eigh_v2/submissions/torch_eigh.py problems/linalg/eigh_v2/submissions/triton_diagonal_fast_path.py`
- `/Users/mark/Dev/kernelbot/.venv/bin/ruff check problems/linalg/eigh_v2`
- `git diff --check`
- `kernelbot_eigh_v2_debug` on `127.0.0.1`, with the local checkout registered through `PROBLEM_DEV_DIR`/`PROBLEMS_REPO`.
- `torch_eigh.py` local submissions on B200:
  - `leaderboard` submission after repeat-budget fix: pass, about 116s end-to-end locally; recorded phase durations included test at 7-10s, benchmark at 34-44s, and leaderboard at 44-48s.
  - `Q must be a plain torch.Tensor.`

Provenance
- Resolved problem directory: `problems/linalg/eigh_v2`.
- Ranked/profile shapes come from `eigh_v2/task.yml` `benchmarks:`.
- Profile mode wraps the submitted kernel in the upstream `custom_kernel` NVTX region.
- Reference-kernels base used for this PR: `origin/main` at `4a1153e`, with this branch at `9bcefc4`.
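For reviewers, the checks named in the Summary (plain tensor outputs, eigen-equation, eigenvalues against `torch.linalg.eigvalsh(A)`, orthogonality) could look roughly like the sketch below. It is an illustration under assumptions, not the checker in this PR: the function name `check_eigh`, the single tolerance `tol`, the Frobenius-norm scaling, the ascending eigenvalue order, and every message except the `Q` one are hypothetical. Reconstruction is omitted for the reason the Summary gives: for square `Q`, `Q.T @ Q = I` implies `Q @ Q.T = I`, so `A @ Q = Q @ diag(L)` already yields `A = Q @ diag(L) @ Q.T`.

```python
import torch


def check_eigh(A: torch.Tensor, L, Q, tol: float = 1e-4):
    """Return (ok, message) for a claimed decomposition A @ Q = Q @ diag(L)."""
    # Exact type check: tensor subclasses are rejected, not just non-tensors.
    if type(L) is not torch.Tensor:
        return False, "L must be a plain torch.Tensor."
    if type(Q) is not torch.Tensor:
        return False, "Q must be a plain torch.Tensor."
    if Q.shape != A.shape or L.shape != A.shape[:-1]:
        return False, "Unexpected output shape."

    n = A.shape[-1]
    # Assumed scaling: relative to the Frobenius norm of A, floored at 1.
    scale = torch.linalg.matrix_norm(A).clamp_min(1.0)

    # Eigen-equation: Q * L.unsqueeze(-2) scales column j of Q by L[j].
    residual = torch.linalg.matrix_norm(A @ Q - Q * L.unsqueeze(-2)) / scale
    if not bool((residual <= tol).all()):
        return False, "Eigen-equation residual too large."

    # Eigenvalues against the reference, assuming ascending order.
    ref = torch.linalg.eigvalsh(A)
    eig_err = (L - ref).abs().amax(dim=-1) / scale
    if not bool((eig_err <= tol).all()):
        return False, "Eigenvalues differ from torch.linalg.eigvalsh(A)."

    # Orthogonality: Q^T Q should be the identity.
    eye = torch.eye(n, dtype=Q.dtype, device=Q.device)
    ortho = torch.linalg.matrix_norm(Q.mT @ Q - eye)
    if not bool((ortho <= tol * n).all()):
        return False, "Q is not orthonormal."

    return True, "ok"
```

NaN outputs fail every comparison above, so they are rejected without a separate finiteness check.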